Stochastic Shortest Path Games

Authors

STEPHEN D. PATEK

DIMITRI P. BERTSEKAS

Abstract

We consider dynamic, two-player, zero-sum games where the "minimizing" player seeks to drive an underlying finite-state dynamic system to a special terminal state along a least expected cost path. The "maximizer" seeks to interfere with the minimizer's progress so as to maximize the expected total cost. We consider, for the first time, undiscounted finite-state problems, with compact action spaces, and transition costs that are not strictly positive. We admit that there are policies for the minimizer which permit the maximizer to prolong the game indefinitely. Under assumptions which generalize deterministic shortest path problems, we establish (i) the existence of a real-valued equilibrium cost vector achievable with stationary policies for the opposing players and (ii) the convergence of value iteration and policy iteration to the unique solution of Bellman's equation.
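To make the value iteration mentioned in the abstract concrete, here is a minimal sketch on a toy problem. It is not the authors' algorithm: it assumes finite action sets instead of compact ones, restricts both players to pure actions, and lets the maximizer respond after the minimizer has chosen (a min-max order), so each state's stage game needs no linear program. The function name value_iteration, the data layout, and the two-state example are all hypothetical.

```python
# Toy value iteration for a shortest-path-style zero-sum game.
# ASSUMPTIONS (not from the paper): finite pure actions, min-max move order,
# and a cost-free terminal state that is left out of the state list.

def value_iteration(states, cost, trans, max_iter=1000, tol=1e-10):
    """Iterate J(i) <- min_u max_v [ cost[i][u][v] + sum_j p_ij(u, v) * J(j) ].

    cost[i][u][v]  : one-stage cost at state i under actions (u, v)
    trans[i][u][v] : dict {next_state: probability} over non-terminal states;
                     any missing probability mass goes to the terminal state,
                     whose cost-to-go is 0.
    """
    J = {i: 0.0 for i in states}
    for _ in range(max_iter):
        J_new = {}
        for i in states:
            J_new[i] = min(
                max(
                    cost[i][u][v]
                    + sum(p * J[j] for j, p in trans[i][u][v].items())
                    for v in range(len(cost[i][u]))
                )
                for u in range(len(cost[i]))
            )
        # Stop once successive iterates agree to within tol.
        if max(abs(J_new[i] - J[i]) for i in states) < tol:
            return J_new
        J = J_new
    return J


# Hypothetical two-state example.
# State 0: minimizer action 0 is cheap but terminates only with probability 0.5;
#          minimizer action 1 terminates immediately.
# State 1: a single action pair, cost 1, moves to state 0.
states = [0, 1]
cost = {
    0: [[1.0, 2.0], [3.0, 1.0]],
    1: [[1.0]],
}
trans = {
    0: [[{0: 0.5}, {0: 0.5}], [{}, {}]],
    1: [[{0: 1.0}]],
}

J = value_iteration(states, cost, trans)
print(J)
```

In this toy instance the iterates settle at J(0) = 3 and J(1) = 4: at state 0 the minimizer prefers to terminate immediately at worst-case cost 3, because the cheaper action lets the maximizer hold the game in state 0 at a higher expected total cost.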


Similar resources

Stochastic Shortest Path Games and Q-Learning

We consider a class of two-player zero-sum stochastic games with finite state and compact control spaces, which we call stochastic shortest path (SSP) games. They are total cost stochastic dynamic games that have a cost-free termination state. Based on their close connection to single-player SSP problems, we introduce model conditions that characterize a general subclass of these games that have...


Reinforcement Learning for Average Reward Zero-Sum Games

We consider Reinforcement Learning for average reward zero-sum stochastic games. We present and analyze two algorithms. The first is based on relative Q-learning and the second on Q-learning for stochastic shortest path games. Convergence is proved using the ODE (Ordinary Differential Equation) method. We further discuss the case where not all the actions are played by the opponent with comparab...


Dynamic Multi Period Production Planning Problem with Semi Markovian Variable Cost (TECHNICAL NOTE)

This paper develops a method for solving the single product multi-period production-planning problem, in which the production and the inventory costs of each period are concave and backlogging is not permitted. It is also assumed that the unit variable cost of the production evolves according to a continuous time Markov process. We prove that this production-planning problem can be stated as a ...


Cost allocation in shortest path games

A class of cooperative games arising from shortest path problems is defined. These shortest path games are shown to be totally balanced and allow a population-monotonic allocation scheme. Possible methods for obtaining core elements are indicated; first, by relating to the allocation rules in taxation and bankruptcy problems, second, by constructing an explicit rule that takes opportunity costs in...


Stochastic Shortest Path Problem with Uncertain Delays

This paper considers a stochastic version of the shortest path problem, the Stochastic Shortest Path Problem with Delay Excess Penalty on directed, acyclic graphs. In this model, the arc costs are deterministic, while each arc has a random delay, assumed normally distributed. A penalty occurs when the given delay constraint is not satisfied. The objective is to minimize the sum of the path cost...
